# Python Basics

We are working towards being able to implement the retirement model with salary growing with promotions. But before we get there, we need to learn some additional Python.

## Conditionals

In Python, we can write `if` to check some condition, just like how we can do `=IF` in Excel. The functionality in Excel is much more limited: either return one value if true or another if false. In Python, the `if` statement just decides whether to run the code under it, and so it is infinitely flexible to do whatever you want after meeting that condition.

### A First `if` Statement 

This is a toy example, but it should show generally how `if` works.

In [1]:
if 'cat' == 'cat':
    print('got a cat')

got a cat


In [2]:
if 'cat' == 'dog':
    print('got a cat')

We can see that when the condition is true, the code under the `if` runs. Conversely, if it is false, the code does not run.

Just like with a `for` loop, you need to indent the code which will be run only if the condition is true.

The `if` statement works by first evaluating the condition to either `True` or `False`. If the result is `True`, the indented code runs; if it is `False`, the indented code is skipped. We can see that conditions evaluate to `True` or `False` by checking their value in Jupyter.

In [3]:
'cat' == 'cat'

True

In [4]:
'cat' == 'dog'

False

So basically, the above statements were equivalent to `if True` and `if False`, respectively:

In [5]:
if True:
    print('got a cat')

got a cat


In [6]:
if False:
    print('got a cat')

### Comparisons vs. Assignments

When comparing whether two objects are equal, use `==` (two equals signs). When assigning, use `=` (one equals sign).

In [7]:
a = 5

In [8]:
a

5

In [9]:
a == 5

True

In [10]:
a == 6

False

In [11]:
a

5

As you can see, we assigned `a` to be `5` and then checked whether its value was equal to 5 or equal to 6. The value was not reassigned to `6`, it stayed as `5` because we were comparing and not assigning. In the same vein, it is invalid to use a single equals in a conditional because it's not a comparison.

In [12]:
if a = 5:
    print('a is 5')

SyntaxError: invalid syntax (<ipython-input-12-fa53bbcdf3b5>, line 1)

It gives us `invalid syntax` because we are trying to assign where it should be a comparison. The following works fine:

In [13]:
if a == 5:
    print('a is 5')

a is 5
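The same idea applies to the other comparison operators, some of which appear later in this section. A small sketch, reusing `a = 5` from above:

```python
a = 5

# Each comparison evaluates to True or False, just like ==
print(a != 6)  # True: a is not equal to 6
print(a < 10)  # True: a is less than 10
print(a >= 5)  # True: a is greater than or equal to 5

# None of these comparisons changed a
print(a)  # 5
```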


### A First `else` Statement

We said earlier that Excel's `=IF` returns one value if the condition is true and another if it is false. So far, the `if` that we have looked at only runs code when the condition is true. So how do we run code when the condition is false? Enter the `else` statement.

In [14]:
my_var = 'abc'

if my_var == 'nope':
    print('this should not run')
else:
    print('runs if condition is False')

runs if condition is False


The `else` statement should come back on the same level of indent as the `if` statement, again have a colon at the end, and then the code which runs if the condition is `False` should be indented again. 

Of course the `else` will not run if the condition is `True`:

In [17]:
if my_var == 'abc':
    print('this should run')
else:
    print('runs if condition is False')

this should run


### Chaining Conditions with `elif`

`elif` is short for "else if". Its code runs only if none of the conditions above it were true and its own condition is true. It is useful if you want to check a value against several ranges or options. For example:

In [19]:
price = 100

if price < 0:
    print("really, you're going to pay me?")
elif price < 50:
    print("it's pretty cheap")
elif price < 150:
    print("it's priced in the mid-range")
else:
    print("this thing is expensive")

it's priced in the mid-range


This is mainly a convenience feature. It would be possible to write the same conditions with just `if`s, but it would take more code:

In [21]:
price = 100

if price < 0:
    print("really, you're going to pay me?")
if price >= 0 and price < 50:
    print("it's pretty cheap")
if price >= 50 and price < 150:
    print("it's priced in the mid-range")
if price >= 150:
    print("this thing is expensive")

it's priced in the mid-range
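One caution about the `if`-only version: it matches the `elif` version only because each condition spells out its lower bound. The sketch below shows what happens when the bounds are left off. It uses a hypothetical `price` of 30 and stores a label in a variable instead of printing in each branch, so the two results can be compared side by side:

```python
price = 30

# With elif, checking stops at the first condition that is True
if price < 50:
    label_elif = 'cheap'
elif price < 150:
    label_elif = 'mid-range'
else:
    label_elif = 'expensive'

# With separate ifs and no lower bounds, every condition is checked,
# so a later match overwrites an earlier one
label_ifs = 'expensive'
if price < 50:
    label_ifs = 'cheap'
if price < 150:
    label_ifs = 'mid-range'

print(label_elif)  # cheap
print(label_ifs)   # mid-range
```

Because 30 is less than both 50 and 150, both of the separate `if` blocks run and the second one wins, which is probably not what was intended.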


## Working more with Lists

So far we have looked at how to construct a list as inputs, and how to iterate over the list with a `for` loop. But there's a lot more we can do with them.

### Adding to Lists

Often lists are useful as a container to store our outputs. So we'll start with an empty list, and then add items to it as we go. Creating an empty list is simple. It's the same syntax as we did before to make a list, but just putting nothing in it.

In [39]:
my_list = []
my_list

[]

We can see that an empty list is represented by square brackets `[]`.

We can add an item to the end of the list using `append`.

In [40]:
my_list.append(5)
my_list

[5]

You can see it is always at the end.

In [41]:
my_list.append(6)
my_list.append(7)
my_list

[5, 6, 7]
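As a sketch of using a list to store outputs, the example below combines `append` with a `for` loop and an `if`. The `prices` values and the variable names are made up for illustration:

```python
prices = [20, 75, 300, 45]  # hypothetical input values

cheap_prices = []  # start with an empty list to hold the outputs
for price in prices:
    if price < 50:
        # only prices under 50 are added, in the order they are found
        cheap_prices.append(price)

print(cheap_prices)  # [20, 45]
```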

### An Aside to Python Indexing

How would you get the second item from a list? It is the item with index 1. This seems very unintuitive to a new programmer, but in most programming languages indexes are **zero-based**.

#### What is Zero-Based Indexing?

This means that the first item in a list is accessed by 0, the second item is accessed by 1, and so on. So whatever item you want to get, subtract 1 from what you would intuitively think is the index. To get the 10th item, access index 9.

#### But Why?

I can't give you a great answer, but you [can read more here](https://softwareengineering.stackexchange.com/questions/110804/why-are-zero-based-arrays-the-norm).

### Back to Adding to Lists

We can also add items at any position in the list using `insert`. `insert` takes two arguments. The first should be the zero-based index at which to add the item, and the second is the item to add.

In [43]:
my_list.insert(0, 'woo')
my_list

['woo', 5, 6, 7]

In [44]:
my_list.insert(2, 10)
my_list

['woo', 5, 10, 6, 7]

As you can see, inserting at index 0 made it the first item, and inserting at index 2 made it the third item. You can also see that you can mix multiple types in a list: here we have both strings and numbers (though usually it's not a good idea).

### List Indexing

Now that we've learned what zero-based indexing means, we can use it to look up items in the list.

In [45]:
my_list

['woo', 5, 10, 6, 7]

In [46]:
my_list[0]

'woo'

In [47]:
my_list[4]

7

You can see that accessing the 0-index element gets the first item in the list, and the 4-index item gets the 5th item in the list. If we try to access an index which is outside of the list length, we'll get an `IndexError: list index out of range`:

In [48]:
my_list[1000]

IndexError: list index out of range

We can also use negative numbers for indices. When they are negative, that means we are counting from the end of the list, backwards. So `-1` is the last item in the list, `-2` is the second to last item, and so on.

In [50]:
my_list[-1]

7

In [51]:
my_list[-2]

6

### List Slicing

You can also pull out multiple items from a list. There we use a colon to denote a range of indices. The index on the left side is included, and everything up until, but not including, the index on the right side will be included.

In [53]:
my_list[1:3]

[5, 10]

You can interpret that as from the 2nd item up until, but not including, the fourth item. AKA items 2 and 3.

You can also leave one end of the colon empty.

In [54]:
my_list[1:]

[5, 10, 6, 7]

That says, give from the 2nd item until the end.

In [55]:
my_list[:-1]

['woo', 5, 10, 6]

That says, give me up until, but not including the last item.
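A few more slices, as a sketch on the same list. The last line joins two lists with `+`, which has not been covered above; it is used here only to show that the two pieces fit back together:

```python
my_list = ['woo', 5, 10, 6, 7]

first_two = my_list[:2]  # items at index 0 and 1
last_two = my_list[-2:]  # the second to last item through the end

print(first_two)  # ['woo', 5]
print(last_two)   # [6, 7]

# Because the right index is excluded, slicing up to index 2 and then
# from index 2 gives two pieces with no overlap and no gap
print(my_list[:2] + my_list[2:] == my_list)  # True
```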

## Functions

A function wraps a logical process up into a single unit. The biggest reason to use them is to avoid repeating code. If you find yourself copy-pasting code repeatedly, you probably should be making it into a function instead. They also help organize your code, and can help avoid mistakes by reusing the same code rather than writing it again.

They will become more and more useful as you do more complex things with code. Hopefully you can see the value from this example but it will become more apparent over time why they are useful.

### A Motivating Example

Let's say we've got some lists of stock prices over time for a few different companies, and we want to report some summary statistics on each one. Without functions, it may look something like this (there are also better ways to write this logic besides using functions, but we haven't covered them yet):

In [59]:
prices_aapl = [300.12, 310.20, 299.50, 302.5, 305.6]
prices_msft = [162.17, 165.89, 172.15, 155.18, 158.96]
prices_amzn = [1892.15, 1864.25, 1795.64, 1804.35, 1824.69]

avg_price_aapl = sum(prices_aapl)/len(prices_aapl)
max_price_aapl = max(prices_aapl)
min_price_aapl = min(prices_aapl)
last_price_aapl = prices_aapl[-1]
print(f'The last price for AAPL was ${last_price_aapl:.2f}. Average: ${avg_price_aapl:.2f}. Max: ${max_price_aapl:.2f}. Min: ${min_price_aapl:.2f}')

avg_price_msft = sum(prices_msft)/len(prices_msft)
max_price_msft = max(prices_msft)
min_price_msft = min(prices_msft)
last_price_msft = prices_msft[-1]
print(f'The last price for MSFT was ${last_price_msft:.2f}. Average: ${avg_price_msft:.2f}. Max: ${max_price_msft:.2f}. Min: ${min_price_msft:.2f}')

avg_price_amzn = sum(prices_amzn)/len(prices_amzn)
max_price_amzn = max(prices_amzn)
min_price_amzn = min(prices_amzn)
last_price_amzn = prices_amzn[-1]
print(f'The last price for AMZN was ${last_price_amzn:.2f}. Average: ${avg_price_amzn:.2f}. Max: ${max_price_amzn:.2f}. Min: ${min_price_amzn:.2f}')

The last price for AAPL was $305.60. Average: $303.58. Max: $310.20. Min: $299.50
The last price for MSFT was $158.96. Average: $162.87. Max: $172.15. Min: $155.18
The last price for AMZN was $1824.69. Average: $1836.22. Max: $1892.15. Min: $1795.64


We can see that this code is very repetitive. The last three blocks are the same thing with just switching out `aapl`, `msft`, and `amzn`. It is time-consuming and error-prone to do things this way. Let's rewrite it with a function to see how it simplifies, then dig into functions in general.

In [61]:
prices_aapl = [300.12, 310.20, 299.50, 302.5, 305.6]
prices_msft = [162.17, 165.89, 172.15, 155.18, 158.96]
prices_amzn = [1892.15, 1864.25, 1795.64, 1804.35, 1824.69]

def price_summary_statistics(prices, ticker):
    """
    Takes a list of prices and the ticker, and prints summary statistics on the price for that ticker.
    """
    avg_price = sum(prices)/len(prices)
    max_price = max(prices)
    min_price = min(prices)
    last_price = prices[-1]
    print(f'The last price for {ticker} was ${last_price:.2f}. Average: ${avg_price:.2f}. Max: ${max_price:.2f}. Min: ${min_price:.2f}')

price_summary_statistics(prices_aapl, 'AAPL')
price_summary_statistics(prices_msft, 'MSFT')
price_summary_statistics(prices_amzn, 'AMZN')

The last price for AAPL was $305.60. Average: $303.58. Max: $310.20. Min: $299.50
The last price for MSFT was $158.96. Average: $162.87. Max: $172.15. Min: $155.18
The last price for AMZN was $1824.69. Average: $1836.22. Max: $1892.15. Min: $1795.64


As you can see, the code now looks less repetitive. It is also clearer what it's doing, and that it's doing the same thing for each of the three sets of prices. If we later decided we wanted to add a new summary statistic, it would be in one spot instead of three. There is also no chance of our three different summaries ending up different by a typo, as they are all using the exact same logic.

### How to Use Functions

Functions must first be defined, then used. Functions are defined with a `def` statement. After they are defined, you can then call them using the name from the `def` statement. Here's a simpler function which just adds two numbers together:

In [64]:
def add_em(a, b):
    return a + b

We can see that running that definition did not produce any output. You must call the function to actually use it. We just have it now available to us. Let's actually use it.

In [66]:
my_result = add_em(2, 5)
my_result

7

When you add a `return` to a function, that value will be sent outside as the result of the function. In the first example function, I didn't use `return` because I didn't need to store any values, just `print` them. But usually you will be using `return` in functions. No matter what you do inside, only what is `return`ed will come outside the function. Any variables you define inside the function only exist inside the function, and they are removed once the function is done executing.

In [69]:
def return_5(a):
    my_value = a + 10
    print(f'my value is {my_value}')
    return 5

my_res = return_5(100)
my_res

my value is 110


5

In [68]:
my_value

NameError: name 'my_value' is not defined

We can see that regardless of what was going on in the function, it `return`ed `5`, and so that's what came outside the function as `my_res`. We can also see that `my_value` was defined and we could use it in the function, but it only lasts inside the function, so when we try to reference it later it's not defined.
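To make the difference between `print` and `return` concrete, here is a small sketch with two hypothetical functions. A function without a `return` still gives back a value when called, but that value is `None`:

```python
def add_and_print(a, b):
    # displays the sum, but nothing is returned
    print(a + b)


def add_and_return(a, b):
    # sends the sum outside the function
    return a + b


printed = add_and_print(2, 5)    # prints 7 while running
returned = add_and_return(2, 5)  # prints nothing

print(printed)   # None
print(returned)  # 7
```

Only `returned` holds a number that can be used in later calculations.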

It is possible to include a description of the function, called a `docstring`. This is great for readability. Let's try again with our last function.

In [70]:
def return_5(a):
    """
    A function that always returns 5, regardless of what you pass to it. It will also print "my value is " and insert 10
    plus the passed value.
    """
    my_value = a + 10
    print(f'my value is {my_value}')
    return 5

A `docstring` should be the first thing under the function definition and should be surrounded by triple quotes. If you follow this, it becomes the official documentation of your function. For example, it will show up when you use Jupyter's `?` shortcut:

In [71]:
return_5?

Signature: return_5(a)
Docstring:
A function that always returns 5, regardless of what you pass to it. It will also print "my value is " and insert 10
plus the passed value.
File:      ~/Dropbox/UF/Teaching/Modeling/Me/repos/fin-model-course/fin_model_course/Examples/Intro/Python/<ipython-input-70-df26c15ab4b7>
Type:      function


We can also provide defaults for function arguments, so that you don't have to pass them.

In [72]:
def get_beginning_of_string(string, num_chars=5):
    """
    Gets the first num_chars of a string
    """
    return string[:num_chars]

get_beginning_of_string('my string')

'my st'

We can see that `num_chars` took a value of 5 by default, but we can also pass a different value:

In [73]:
get_beginning_of_string('my string', 6)

'my str'
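Arguments can also be passed by name, which can make it clearer which value goes where. A short sketch, repeating the function definition so that it runs on its own:

```python
def get_beginning_of_string(string, num_chars=5):
    """
    Gets the first num_chars of a string
    """
    return string[:num_chars]


# Passing the second argument by position and by name gives the same result
by_position = get_beginning_of_string('my string', 2)
by_name = get_beginning_of_string('my string', num_chars=2)

print(by_position)  # my
print(by_name)      # my
```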

## Exploring Data Types

In Python, everything is an object except for variable names, which are references to objects. Every object has a type. We have learned about strings, numbers,
lists, and booleans (True, False).

For reference on all this material, see https://docs.python.org/3/library/stdtypes.html

### Strings

In [1]:
a = 'abc'

Here `a` is a variable name, it is a reference that points to the object `'abc'` which is a string object.

In [2]:
a

'abc'

In [3]:
type(a)

str

We can use `type()` to check the type of any object.

We can do a lot with strings. To see what is available on a string variable, type its name followed by `.` and then press tab, for example: `a.` Since I can't get that printed out in the completed notebook, I'll create a quick helper function to get the methods.

In [9]:
def get_methods(obj):
    """
    Displays the methods and attributes on an object.
    
    Don't worry about how this is implemented, I am just adding this to print out the methods in the notebook, which you
    can do without this function by typing the name of the variable then . and then pressing tab.
    """
    return [method for method in dir(type(obj)) if not method.startswith('__')]

In [10]:
get_methods(a)

['capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']

As an example, let's try `islower`:

In [11]:
a.islower()

True

In [12]:
'abc'.islower()

True

In [13]:
'AbC'.islower()

False
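A few more methods from the list above, as a quick sketch. The `ticker` variable is just an illustrative name. Note that these methods return a new string rather than changing the original:

```python
ticker = 'aapl'

print(ticker.upper())            # AAPL
print(ticker.replace('a', 'A'))  # AApl
print('a,b,c'.split(','))        # ['a', 'b', 'c']

# The original string is unchanged by the calls above
print(ticker)  # aapl
```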

### Numbers

So far I have just said that numbers are a type in Python, but this is a simplification. There are two main types of numbers in Python: `float` and `int`, corresponding to a floating point number and an integer, respectively. An int is a number without decimals, while a float has decimals, regardless of whether they are zero.

For example, `3.5` and `3.0` are floats, while `3` is an int, even though `3.0 == 3` is `True`.




In [14]:
type(3.5)

float

In [15]:
type(3.0)

float

In [16]:
type(3)

int

In [17]:
3 == 3.0

True

In [18]:
type(3) == type(3.0)

False

Usually, this doesn't matter. But to loop a number of times with `range`, you must pass an int.

In [20]:
for i in range(5):
    print(f'loop number {i + 1}')

loop number 1
loop number 2
loop number 3
loop number 4
loop number 5


This uses a float to try to loop over, so it will raise a `TypeError: 'float' object cannot be interpreted as an integer`

In [21]:
for i in range(5.0):
    print(f'loop number {i + 1}')

TypeError: 'float' object cannot be interpreted as an integer
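If you end up with a float where an int is needed, one option is to convert it with the `int` constructor first. A minimal sketch, with `num_loops` as a made-up variable name:

```python
num_loops = 5.0  # a float, for example the result of an earlier calculation

# int() converts the float to an int, which range() accepts
for i in range(int(num_loops)):
    print(f'loop number {i + 1}')

# Be aware that int() drops the decimals rather than rounding
print(int(5.9))  # 5
```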

### Tuples

A tuple is very similar to a `list`, the main difference being that after it is created, it cannot change. Whether an object may change after it is created is referred to as mutability. Because the `tuple` cannot change it is immutable, while the `list` is mutable.

To create a tuple, use parentheses instead of brackets.

In [23]:
my_tup = (1, 2, 3)
my_tup

(1, 2, 3)

You can also convert a list to a tuple using the `tuple` constructor. The reverse works too, using `list`.

In [24]:
my_list = ['a', 'b', 'c']
my_second_tup = tuple(my_list)
my_second_tup

('a', 'b', 'c')

In [25]:
list(my_second_tup)

['a', 'b', 'c']

Be careful if you try to create a tuple with a single element. You must include a trailing comma.

In [26]:
single_elem_tup = ('a',)
single_elem_tup

('a',)

In [29]:
type(single_elem_tup)

tuple

If you try to do it without the comma, Python will interpret the parentheses as ordinary grouping (like in a math expression) and not as a tuple.

In [27]:
not_a_tup = ('a')
not_a_tup

'a'

In [28]:
type(not_a_tup)

str

Tuples can be indexed just like lists.

In [41]:
my_second_tup[0]

'a'

In [42]:
my_second_tup[:-1]

('a', 'b')
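To see immutability in action, the sketch below tries to replace the first item of a list and of a tuple. It uses `try`/`except`, which has not been covered here, only so that the example keeps running after the error:

```python
my_list = ['a', 'b', 'c']
my_list[0] = 'z'  # fine: lists are mutable
print(my_list)    # ['z', 'b', 'c']

my_tup = ('a', 'b', 'c')
try:
    my_tup[0] = 'z'  # not allowed: tuples are immutable
except TypeError as error:
    print(f'TypeError: {error}')

print(my_tup)  # ('a', 'b', 'c')
```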

### Dictionaries

A `dict`, short for dictionary, stores a mapping from keys to values. Use them if you want to store values associated with other values. Define them with curly braces and colons:

In [30]:
my_dict = {'a': 1, 'b': 2}
my_dict

{'a': 1, 'b': 2}

Look up items by brackets.

In [32]:
my_dict['a']

1

In [33]:
my_dict['b']

2

If you try to access a key which doesn't exist, it will raise a `KeyError`.

In [34]:
my_dict['not there']

KeyError: 'not there'
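As a brief sketch ahead of the fuller treatment later, the same bracket syntax can be used to add or change items, and `in` can check whether a key exists before looking it up:

```python
my_dict = {'a': 1, 'b': 2}

my_dict['c'] = 3    # assigning to a new key adds it
my_dict['a'] = 100  # assigning to an existing key replaces its value
print(my_dict)      # {'a': 100, 'b': 2, 'c': 3}

# Checking first avoids the KeyError
if 'not there' in my_dict:
    print(my_dict['not there'])
else:
    print('key is missing')
```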

We will come back to dicts later in the course, but I wanted to
introduce them now as they are a very fundamental data type.

### Nesting Data Types

In general in Python, things can be used inside of each other in an arbitrarily nested fashion. Loops can be inside loops, loops inside `if` statements, and so on, as much as you want. The same concept exists with data structures. For example, a very common nested data structure is a list of dictionaries:

In [36]:
inputs = [
    {
        'name': 'John',
        'weight': 180
    },
    {
        'name': "Sarah",
        'weight': 140
    }
]

for inp in inputs:
    print(f'My name is {inp["name"]} and my weight is {inp["weight"]}')

My name is John and my weight is 180
My name is Sarah and my weight is 140
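Nested structures are accessed one level at a time: first index into the list, then look up the key in the dictionary. A small sketch with the same data, which also collects the weights into a list to compute an average:

```python
inputs = [
    {'name': 'John', 'weight': 180},
    {'name': 'Sarah', 'weight': 140},
]

# Index 1 gets the second dictionary, then 'name' looks up its value
print(inputs[1]['name'])  # Sarah

weights = []
for inp in inputs:
    weights.append(inp['weight'])

avg_weight = sum(weights) / len(weights)
print(weights)     # [180, 140]
print(avg_weight)  # 160.0
```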


This can go as deep as you want. For example (no need to understand what's going on here, this is just showing that things can be nested):

In [40]:
inputs = [
    {
        'name': 'John',
        'friends': [
            {
                'name': 'Joe',
                'shared_activities': ('Jogging', 'Camping')
            }
        ]
    },
    {
        'name': 'Sarah',
        'friends': [
            {
                'name': 'Martha',
                'shared_activities': ('Rock Climbing', 'Video Games')
            },
            {
                'name': 'Jimmy',
                'shared_activities': ('Backgammon',)
            }
        ]
    },
]

for inp in inputs:
    friend_num = 0
    for friend in inp['friends']:
        friend_num = friend_num + 1
        if friend_num == 1:
            print(f'{inp["name"]} hangs out with {friend["name"]} with shared activities: {", ".join(friend["shared_activities"])}')
        else:
            print(f'{inp["name"]} also hangs out with {friend["name"]} with shared activities: {", ".join(friend["shared_activities"])}')

John hangs out with Joe with shared activities: Jogging, Camping
Sarah hangs out with Martha with shared activities: Rock Climbing, Video Games
Sarah also hangs out with Jimmy with shared activities: Backgammon


## Working with Classes

In Python, everything is an object except for variable names, which
are references to objects. Strings, floats, ints, lists, and tuples are types of objects. There are
many more types of objects and users can define their own types of
objects. A class is a definition for a type of object. It defines how it is created,
the data stored in it, and the functions attached to it. We can write our own classes to create new types of objects to work
with. 

From a single class definition, an unlimited number of objects can be
created.

### Built-in Classes

Hint: These are the Built-in types described above! The class is just the definition for the type. The only difference with the built-in types is that most of them have some more convenient way to create them, and you never have to import them.

In [50]:
my_str = '10'  # more convenient way to create objects of type str
my_str

'10'

In [51]:
type(my_str)

str

There we have created an object of type `str`. We can do the same thing with the actual constructor.

In [52]:
my_str = str(10)  # also creates an object of type str. More similar to how objects with custom types will be created
my_str

'10'

In [53]:
type(my_str)

str
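The other built-in types have constructors too, and they are a common way to convert between types. A short sketch:

```python
my_int = int('10')         # str to int
my_float = float('3.5')    # str to float
my_list = list((1, 2, 3))  # tuple to list

print(my_int + 5)    # 15
print(my_float * 2)  # 7.0
print(my_list)       # [1, 2, 3]

# A str behaves differently from a number: + joins strings together
print(str(10) + '5')  # 105
```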

### Custom Classes

We will not cover in this course how to make general classes. If you want to see an example, open `car_example.py`. We will instead focus on using classes, and also creating dataclasses (next section).

I have already defined a class `Car` in `car_example.py`. If you put it in the same folder as this notebook, that will allow you to `import car_example` or `from car_example import Car`. The `from` import syntax just allows you to access whatever you import without the module name. I would have to do `car_example.Car` if I just used `import car_example`.

In [54]:
from car_example import Car

my_car = Car('Subaru', 'Forester')
my_car

Car(make=Subaru, model=Forester)

In [55]:
my_car.drive()

The Subaru Forester is driving away!
The gas is now at 40%


In [56]:
for i in range(5):
    my_car.drive()

The Subaru Forester is driving away!
The gas is now at 30%
The Subaru Forester is driving away!
The gas is now at 20%
The Subaru Forester is driving away!
The gas is now at 10%
The Subaru Forester is driving away!
The gas is now at 0%
The Subaru Forester is out of gas!


While this is a toy example, it shows in general how to construct an object of a custom type, and how to call the functions (methods) of the object. This will be useful as many of the modules we can `import` have their own custom types.

### Dataclasses

We can use `dataclasses` as a simple way to create a class that we can use to store data. Before you can use them, you must run `from dataclasses import dataclass`. This is a built-in module but it still needs to be imported. Here is an example:

In [61]:
from dataclasses import dataclass


@dataclass
class ModelInputs:
    pmt: float = 1000
    interest_rates: tuple = (0.05, 0.06, 0.07)
    

There are a few things to break down here. First is this mysterious `@dataclass`. This is a decorator, which is a more advanced feature of Python that we won't focus on in this course. Just know you have to put it directly above the class definition.

Next is the `class ModelInputs:` line. Just like `def`, `for`, and `if`, we need a colon and an indent for Python to understand the class definition.

Then we have `pmt: float = 1000`. We can break this down into three parts. First is `pmt`. This will be the name of the attribute in the object containing that data, so we will access `.pmt` of the created object to get that data. Next is a colon. Then after the colon, we specify the type of the variable. Since payment can be any number, we will assign it the `float` type. Then, optionally, we can include a default for the value. `pmt` will default to `1000` if it is not passed.

The next line has a similar syntax. It is defining `.interest_rates` in the created object. The difference here is the type assigned to the variable: this one is a tuple. You may think to use a list, but this is a bad idea for our model data since it should not be changing after it is created. For a related reason, `dataclass` will not even allow you to set a list as a default value without using a special syntax.

Now we have defined the `ModelInputs` `dataclass`. Let's use it.

In [62]:
data = ModelInputs()
data

ModelInputs(pmt=1000, interest_rates=(0.05, 0.06, 0.07))

We can see that because we provided defaults for all the inputs, we didn't even need to pass anything to create the data. Then we can access the individual attributes of the data:

In [63]:
data.pmt

1000

In [64]:
data.interest_rates

(0.05, 0.06, 0.07)

We can also pass values to override those defaults.


In [66]:
data = ModelInputs(interest_rates=(0.03, 0.04))
data

ModelInputs(pmt=1000, interest_rates=(0.03, 0.04))

In [68]:
data.interest_rates

(0.03, 0.04)
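For reference, the special syntax needed for a list default is `field(default_factory=...)` from the same `dataclasses` module. The sketch below is only an illustration: the class name `ListModelInputs` and the helper `default_rates` are hypothetical, and the tuple version above remains the simpler choice for model inputs.

```python
from dataclasses import dataclass, field


def default_rates():
    # called each time an object is created, so every object gets its own list
    return [0.05, 0.06, 0.07]


@dataclass
class ListModelInputs:  # hypothetical class, for illustration only
    pmt: float = 1000
    interest_rates: list = field(default_factory=default_rates)


first = ListModelInputs()
second = ListModelInputs()

# Changing the list on one object does not affect the other
first.interest_rates.append(0.08)
print(first.interest_rates)   # [0.05, 0.06, 0.07, 0.08]
print(second.interest_rates)  # [0.05, 0.06, 0.07]
```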

#### Why use `Dataclasses`?

If you write a function that works on your data, without using `dataclasses`, it would be set up something like this:

In [70]:
pmt = 1000
interest_rates = (0.03, 0.04)

def principals_from_pmt_and_rates(pmt, rates):
    for rate in rates:
        principal = pmt/rate
        print(f'The principal for the {rate:.0%} rate is ${principal:,.2f}')
        
principals_from_pmt_and_rates(pmt, interest_rates)

The principal for the 3% rate is $33,333.33
The principal for the 4% rate is $25,000.00


Using the dataclass above, it becomes:

In [71]:
data = ModelInputs(
    pmt=1000,
    interest_rates=(0.03, 0.04)
)

def principals_from_pmt_and_rates(data):
    for rate in data.interest_rates:
        principal = data.pmt/rate
        print(f'The principal for the {rate:.0%} rate is ${principal:,.2f}')
        
principals_from_pmt_and_rates(data)

The principal for the 3% rate is $33,333.33
The principal for the 4% rate is $25,000.00


This may not look a whole lot different, but you'll notice we're now only passing a single argument to the function instead of two. By the time you've got 15 different inputs and 10 different functions, this can save a lot of typing! Further, we can at any time tab complete all the inputs in the model by pressing tab after `data.`
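Putting several pieces of this section together, here is one possible variation that returns the results in a list instead of printing them, so they can be used later. The function name `get_principals` is made up for this sketch, and the dataclass is repeated so the example runs on its own:

```python
from dataclasses import dataclass


@dataclass
class ModelInputs:
    pmt: float = 1000
    interest_rates: tuple = (0.05, 0.06, 0.07)


def get_principals(data):
    """
    Returns a list with one principal per interest rate in the passed ModelInputs.
    """
    principals = []
    for rate in data.interest_rates:
        principals.append(data.pmt / rate)
    return principals


data = ModelInputs(interest_rates=(0.03, 0.04))
principals = get_principals(data)

for principal in principals:
    print(f'${principal:,.2f}')
```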